Performance and power comparison of Thread Level Speculation in SMT and CMP architectures
Authors
Abstract
As technology advances, microprocessors that support multiple threads of execution on a single chip are becoming increasingly common. Improving the performance of general-purpose applications by extracting parallel threads is extremely difficult, due to the complex control flow and ambiguous data dependences that are inherent to these applications. Thread-Level Speculation (TLS) enables speculative parallel execution of potentially dependent threads, and ensures correct execution by providing hardware support to detect data dependence violations and to recover from speculation failures. TLS can be supported on a variety of architectures, among them Chip MultiProcessors (CMP) and Simultaneous MultiThreading (SMT). While numerous papers have compared the performance and power efficiency of SMT and CMP processors under various workloads, relatively little has been done to compare them in the context of TLS. While CMPs utilize smaller and more power-efficient cores, resource sharing and constructive interference between speculative and non-speculative threads can potentially make SMT more power efficient. This paper aims to fill this void by extending a CMP and an SMT processor to support TLS, and evaluating the performance and power efficiency of the resulting systems with speculative parallel threads extracted for the SPEC2000 benchmark suite. Because both SMT and CMP processors come in a large variety of configurations, we choose to conduct our study on two architectures with equal die area and the same clock frequency. Our results show that an SMT processor that supports four speculative threads outperforms a CMP processor that supports the same
Similar resources
Speculative Precomputation on Chip Multiprocessors
Previous work on speculative precomputation (SP) on simultaneous multithreaded (SMT) architectures has shown significant benefits. The SP techniques improve single-threaded program performance by utilizing otherwise idle thread contexts to run “helper threads”, which prefetch critical data into shared caches and reduce the time the “main thread” stalls waiting for long-latency outstanding loads....
Comparing the Energy Efficiency of CMP and SMT Architectures for Multimedia Workloads
Chip multiprocessing (CMP) and simultaneous multithreading (SMT) are two recently adopted techniques for improving the throughput of general-purpose processors by using multithreading. These techniques are likely to benefit the increasingly important real-time multimedia workloads, which are inherently multithreaded. These workloads, however, often run in an energy constrained environment. This...
Thread-Level Speculation on a CMP Can Be Energy Efficient
While Chip Multiprocessors (CMP) with Thread-Level Speculation (TLS) have become the subject of intense research, processor designers in industry have reservations about their practical implementation. An often-cited complaint is that TLS is too energy-inefficient to compete against conventional superscalars. This paper challenges the commonly held view that TLS is energy inefficient. We identi...
Dynamic Helper Threaded Prefetching on the Sun UltraSPARC CMP Processor
Data prefetching via helper threading has been extensively investigated on Simultaneous MultiThreading (SMT) or Virtual Multi-Threading (VMT) architectures. Although reportedly large cache latency can be hidden by helper threads at runtime, most techniques rely on hardware support to reduce context switch overhead between the main thread and helper thread as well as rely on static profile feedb...
Temperature-Aware Design Issues for SMT and CMP Architectures
With increasing power density in modern processors, management of on-chip temperature is fast becoming a bottleneck for chip designers. To address this, beyond conventional power and energy analysis it is necessary to apply temperature-aware analysis. In this paper we present thermal-aware experiments on simultaneous multithreaded (SMT) and chip multiprocessor (CMP) architectures. Both SMT and ...
Publication date: 2007